Abstract. We develop an approach through geometric functional analysis to
error correcting codes and to reconstruction of signals from few linear
measurements. An error correcting code encodes an $n$-letter word $x$ into an
$m$-letter word $y$ in such a way that $x$ can be decoded correctly when any $r$
letters of $y$ are corrupted. We prove that most linear orthogonal transformations
$Q : \mathbb{R}^n \to \mathbb{R}^m$ form efficient and robust error correcting
codes over the reals. The decoder (which corrects the corrupted components of $y$)
is the metric projection onto the range of $Q$ in the $\ell_1$ norm. An equivalent
problem arises in signal processing: how to reconstruct a signal that belongs to a
small class from few linear measurements? We prove that for most sets of Gaussian
measurements, all signals of small support can be exactly reconstructed by the
$\ell_1$ norm minimization. This is a substantial improvement of recent results of
Donoho and of Candes and Tao. An equivalent problem in combinatorial geometry is
the existence of a polytope with a fixed number of facets and the maximal number of
lower-dimensional facets. We prove that most sections of the cube form such
polytopes.



1. Error correcting codes and transform coding 

Error correcting codes are used in modern technology to protect information from 
errors. Information is formed by finite words over some alphabet F. An encoder 
transforms an n-letter word x into an m-letter word y with m > n. The decoder 
must be able to recover x correctly when up to r letters of y are corrupted in any 
way. Such an encoder-decoder pair is called an $(n, m, r)$-error correcting code.

Development of algorithmically efficient error correcting codes has attracted the
attention of engineers, computer scientists and applied mathematicians for the past
five decades. Known constructions involve deep algebraic and combinatorial methods,
see [?]. This paper develops a new approach to error correcting codes
from the viewpoint of geometric functional analysis (asymptotic convex geometry).
Our main focus will be on words over the alphabet $F = \mathbb{R}$ or $\mathbb{C}$.
In applications, these words may be formed of the coefficients of some signal (such
as image or audio) with respect to some basis or overcomplete system (Fourier,
wavelet, etc.). Finite alphabets will be discussed in Section 5.
The simplest and most natural way to encode a vector $x \in \mathbb{R}^n$ into a
vector $y \in \mathbb{R}^m$ is of course a linear transform

$$ y = Qx \tag{1.1} $$

where $Q$ is given by an $m \times n$ matrix. Elementary linear algebra tells us that
if $m \ge n + 2r$ and the range of $Q$ is generic$^1$ then $x$ can be recovered from
$y$ even if $r$ coordinates of $y$ are corrupted. This gives an $(n, m, r)$-error
correcting code. However, the decoder for this code has a huge computational
complexity, as it involves a search through all $r$-element subsets of the
components of $y$. Then the problem is:

    How to reconstruct a vector $y$ in an $n$-dimensional subspace $Y$ of
    $\mathbb{R}^m$ from a vector $y' \in \mathbb{R}^m$ that differs from $y$ in at
    most $r$ coordinates?

What complicates this problem is the arbitrary magnitude of errors in each corrupted
component of $y'$, in contrast to what happens over finite alphabets such as
$F = \{0, 1\}$.

A traditional and simple approach to denoising $y'$, used in applications such as
signal processing, is the mean least square (MLS) minimization. One hopes that $y$
is well approximated by a solution to the minimization problem

$$ \min_{u \in Y} \|u - y'\|_2 \tag{MLS} $$

where $\|x\|_2^2 = \sum_i |x_i|^2$. The solution to (MLS) is simply the orthogonal
projection of $y'$ onto $Y$. This of course cannot recover $y$ exactly, and even the
approximation is typically poor since we have no control of the magnitude of the
errors in the corrupted coordinates. A promising alternative approach is the Basis
Pursuit (BP). We simply replace the 2-norm by the 1-norm and expect $y$ to be the
exact and unique solution to the minimization problem

$$ \min_{u \in Y} \|u - y'\|_1 \tag{BP} $$

where $\|x\|_1 = \sum_i |x_i|$. Thus a solution to (BP) is the metric projection of
$y'$ onto $Y$ with respect to the 1-norm. (BP) can be cast as a Linear Programming
problem, and can be attacked with a variety of methods, such as the classical
simplex method or more recent interior point methods that yield polynomial time
algorithms [?].
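The contrast between (MLS) and (BP) can be seen in the simplest case $n = 1$, where $Y$ is the line spanned by one vector $q$. The following is a minimal sketch under that assumption only: for a line, (BP) reduces to a weighted median of the ratios $y'_i / q_i$ with weights $|q_i|$, so no Linear Programming solver is needed. The vector `q`, the value `t_true` and the corruptions are made up for illustration; for general $n$ one would need an LP solver instead.

```python
def l2_projection_on_line(q, y_noisy):
    """(MLS) on the line span(q): minimise sum_i (t*q_i - y'_i)^2 over t."""
    return sum(qi * yi for qi, yi in zip(q, y_noisy)) / sum(qi * qi for qi in q)


def l1_projection_on_line(q, y_noisy):
    """(BP) on the line span(q): minimise sum_i |t*q_i - y'_i| over t.

    The objective equals sum_i |q_i| * |t - y'_i/q_i| (plus terms with
    q_i == 0 that do not depend on t), so a weighted median of the ratios
    y'_i/q_i with weights |q_i| is a minimiser.
    """
    pairs = sorted((yi / qi, abs(qi)) for qi, yi in zip(q, y_noisy) if qi != 0)
    half = sum(w for _, w in pairs) / 2.0
    acc = 0.0
    for ratio, w in pairs:
        acc += w
        if acc >= half:
            return ratio
    raise ValueError("q must have a nonzero entry")


# Illustrative data (assumed, not from the paper): n = 1, m = 7, r = 2.
q = [1.0, -2.0, 1.5, 0.5, -1.0, 2.0, 1.0]
t_true = 3.0
y = [t_true * qi for qi in q]       # y = Qx with x = t_true
y_noisy = list(y)
y_noisy[1] += 50.0                   # corrupted coordinates of arbitrary size
y_noisy[4] -= 17.0

t_l2 = l2_projection_on_line(q, y_noisy)   # pulled far away by the outliers
t_l1 = l1_projection_on_line(q, y_noisy)   # equals t_true in this example
```

In this example the corrupted coordinates carry less than half of the total weight $\sum_i |q_i|$, which is why the weighted median ignores them; the least squares estimate has no such protection.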




[Figure: two panels, labeled (MLS) and (BP).]

$^1$ That is, in general position with respect to all coordinate subspaces
$\mathbb{R}^I$, $|I| = r$.
The potential of Basis Pursuit for exact reconstruction is illustrated by the
following heuristics, essentially due to [?]. The solution $u$ to (MLS) is the
contact point where the smallest Euclidean ball centered at $y'$ meets the subspace
$Y$. That contact point is in general different from $y$. The situation is much
better in (BP): typically the solution coincides with $y$. The solution $u$ to (BP)
is the contact point where the smallest octahedron centered at $y'$ (the ball with
respect to the 1-norm) meets $Y$. Because the vector $y - y'$ lies in a
low-dimensional coordinate subspace, the octahedron has a wedge at $y$. Thus, many
subspaces $Y$ through $y$ will miss the octahedron of radius $\|y - y'\|_1$ (as
opposed to the Euclidean ball). This forces the solution $u$ to (BP), which is the
contact point of the octahedron, to coincide with $y$.

The idea of using the 1-norm instead of the 2-norm for better data recovery has
been explored since the mid-seventies in various applied areas, in particular
geophysics and statistics (early history can be found in [?]). With the subsequent
development of fast interior point methods in Linear Programming, (BP) turned into
an effectively solvable problem, and was put forward more recently by Donoho and his
collaborators, triggering massive experimental and theoretical work
[17, 18, 19, ...].

The main result of this paper validates the Basis Pursuit method for most subspaces
$Y$ under an asymptotically sharp condition on $m, n, r$. We thus prove that the
Basis Pursuit yields exact reconstruction for most subspaces $Y$ in the
Grassmannian. The randomness is with respect to the normalized Haar measure on the
Grassmannian $G_{m,n}$ of $n$-dimensional subspaces of $\mathbb{R}^m$. Positive
absolute constants will be denoted throughout the paper by $C, c, C_1, \ldots$

Theorem 1.1. Let $m$, $n$ and $r < cm$ be positive integers such that

$$ m = n + R, \quad \text{where } R \ge C r \log(m/r). \tag{1.2} $$

Then a random $n$-dimensional subspace $Y$ in $\mathbb{R}^m$ satisfies the following
with probability at least $1 - e^{-cR}$. Let $y \in Y$ be an unknown vector, and
suppose we are given a vector $y'$ in $\mathbb{R}^m$ that differs from $y$ in at
most $r$ coordinates. Then $y$ can be exactly reconstructed from $y'$ as the
solution to the minimization problem (BP).

In an equivalent form, this theorem is a substantial improvement of recent results
of Donoho [?] and of Candes and Tao [8], see Theorem 2.1 below.

1.1. Error correcting codes. Theorem 1.1 implies a natural $(n, m, r)$-error
correcting code over $\mathbb{R}$. The encoder (1.1) is given by an $m \times n$
random orthogonal matrix$^2$ $Q$. Its range $Y$ is a random $n$-dimensional subspace
in $\mathbb{R}^m$. The decoder takes a corrupted vector $y'$, solves (BP) and
outputs $Q^T u = Q^{-1} u$. Theorem 1.1 states that under the assumption (1.2), this
encoder-decoder pair is an $(n, m, r)$-error correcting code with exponentially good
probability $\ge 1 - e^{-cR}$.

$^2$ One can view it as the first $n$ columns of a random matrix from $O(m)$
equipped with the normalized Haar measure.
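The encoding step and the final inversion $x = Q^T u$ can be sketched in a few lines. This is only an illustration under stated assumptions: the orthonormal columns are obtained by Gram-Schmidt applied to independent Gaussian vectors (a standard way to sample a Haar-distributed orthonormal frame), the helper names are hypothetical, and the (BP) step itself is omitted, so the example only checks that $Q^T Q x = x$ on an uncorrupted codeword.

```python
import random


def random_orthonormal_columns(m, n, rng):
    """Return n orthonormal vectors in R^m, to be read as the columns of Q."""
    cols = []
    while len(cols) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for _ in range(2):                 # two Gram-Schmidt passes for stability
            for c in cols:
                d = sum(a * b for a, b in zip(v, c))
                v = [a - d * b for a, b in zip(v, c)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-9:                    # resample in the (unlikely) degenerate case
            cols.append([a / norm for a in v])
    return cols


def encode(cols, x):
    """y = Qx, where cols[j] is the j-th column of Q."""
    m = len(cols[0])
    return [sum(x[j] * cols[j][i] for j in range(len(cols))) for i in range(m)]


def apply_transpose(cols, u):
    """Q^T u, which inverts Q on its range because Q^T Q is the identity."""
    return [sum(a * b for a, b in zip(c, u)) for c in cols]


rng = random.Random(0)
Q_cols = random_orthonormal_columns(m=12, n=4, rng=rng)
x = [2.0, -1.0, 0.5, 3.0]
y = encode(Q_cols, x)
x_back = apply_transpose(Q_cols, y)        # agrees with x up to rounding error
```

A full decoder would first replace a corrupted $y'$ by the solution $u$ of (BP) and only then apply `apply_transpose`.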
1.2. Sharpness. The sufficient condition (1.2) is sharp up to an absolute constant
$C$ (see Section 5) and is only slightly stronger than the necessary condition
$m \ge n + 2r$. The ratio $\varepsilon = r/m$ in (1.2) is the number of errors per
letter in the noisy communication channel that maps $y$ to $y'$. Thus $\varepsilon$
should be considered as a quality of the channel, which is independent of the
message. In terms of $\varepsilon$, (1.2) is equivalent to

$$ m \ge \Big(1 + C \varepsilon \log \frac{1}{\varepsilon}\Big)\, n. $$
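To get a feeling for the size of the redundancy in (1.2), one can evaluate the two expressions numerically. The sketch below is illustrative only: the paper does not specify the absolute constant $C$, so the default `C=1.0` is a placeholder assumption, and the function names are hypothetical.

```python
import math


def redundancy_lower_bound(r, m, C=1.0):
    """Right-hand side of (1.2): C * r * log(m/r). C is a placeholder value."""
    return C * r * math.log(m / r)


def rate_factor(eps, C=1.0):
    """The factor 1 + C * eps * log(1/eps) that multiplies n in the rate form."""
    return 1.0 + C * eps * math.log(1.0 / eps)


m, r = 10_000, 100
eps = r / m                                  # errors per letter
R_min = redundancy_lower_bound(r, m)         # about 460.5 when C = 1
factor = rate_factor(eps)                    # about 1.046 when C = 1
```

With the same placeholder constant, `R_min / m` equals `eps * log(1/eps)`, which is the link between the two forms of the condition.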

1.3. Robustness. A natural feature of our error correcting code is its robustness.
Simple linear algebra yields that the solution to (BP) is stable with respect to the
1-norm, in the same way as the solution to (MLS) is stable with respect to the
2-norm, see [?]. Such robustness allows in particular quantization of the messages.
This immediately yields error correcting codes for finite alphabets, see Section 5.

1.4. Transform coding. In signal processing, the linear codes (1.1) are known
as transform codes. The general paradigm about transform codes is that the
redundancies in the coefficients of $y$ that come from the excess of the dimension
$m > n$ should guarantee a stability of the signal with respect to noise,
quantization, erasures, etc. This is confirmed by extensive experimental and some
theoretical work, see e.g. [?] and the bibliography contained therein.
Theorem 1.1 states that most orthogonal transform codes are good error correcting
codes.

Acknowledgement. This work started when the second author was visiting the
University of Missouri-Columbia as a Miller Visiting Scholar. He is grateful to UMC
for the hospitality.

2. Reconstruction of signals from linear measurements. 

The heuristic idea that guides the Statistical Learning Theory is that a function
$f$ from a small class should be determined by few linear measurements. Linear
measurements are generally given by some linear functionals in the dual space,
which are fixed (in particular, independent of $f$). The most common measurements
are point evaluation functionals; the problem there is to interpolate $f$ between
known values while keeping $f$ in the known (small) class. When the evaluation
points are chosen at random, this becomes the 'proper learning' problem of the
Statistical Learning Theory (see [31]).

We shall however be interested in general linear measurements. The proposal to
learn $f$ from general linear measurements ('sensing') originated recently from
criticism of the current methodology of signal compression. Most real-life signals,
such as images and sounds, seem to belong to small classes. This is because they
carry a lot of unwanted information that can be discarded with almost no perceptual
loss, which makes such signals easily compressible. Donoho [?] then questions the
conventional scheme of signal processing, where the whole signal must first be
acquired (together with lots of unwanted information) and only then be compressed
(throwing away the unwanted part). Instead, can one directly acquire ('sense') the
essential part of the signal, via few linear measurements? Similar issues are raised
in [?]. We shall operate under the assumption that some technology allows us to take
linear measurements in certain fixed 'directions' $X_k$.

We will assume that our signal $f$ is discrete, so we view it as a vector in
$\mathbb{R}^m$. Suppose we can take linear measurements $\langle f, X_k \rangle$
with some fixed vectors $X_1, X_2, \ldots, X_R$ in $\mathbb{R}^m$. Assuming that $f$
belongs to a small class, how many measurements $R$ are needed to reconstruct $f$?
And even when we prove that $R$ measurements do determine $f$ (uniquely or
approximately), the algorithmic issue remains unsettled: how can one reconstruct $f$
from these measurements?

The previous section suggests reconstructing $f$ as a solution to the Basis Pursuit
minimization problem

$$ \min \|g\|_1 \quad \text{subject to} \quad \langle g, X_k \rangle = \langle f, X_k \rangle, \quad k = 1, \ldots, R. \tag{BP'} $$

For the Basis Pursuit to work, the vectors $X_k$ must be in a good position with
respect to all coordinate subspaces $\mathbb{R}^I$, $|I| \le r$. A typical choice for
such vectors would be the independent standard Gaussian vectors$^3$ $X_k$.

2.1. Functions with small support. In the class of functions with small support,
one can hope for exact reconstruction. Candes and Tao [8] have indeed proved that
every fixed function $f$ with support $|\mathrm{supp} f| \le r$ can be recovered by
(BP'), correctly with the polynomial probability $1 - m^{-\mathrm{const}}$, from
$R = C r \log m$ Gaussian measurements. However, the polynomial probability is
clearly not sufficient to deduce that there is one set of vectors $X_k$ that can be
used to reconstruct all functions $f$ of small support.

The following equivalent form of Theorem 1.1 does yield a uniform exact
reconstruction. It provides us with one set of linear measurements from which we
can effectively reconstruct every signal of small support.

Theorem 2.1 (Uniform Exact Reconstruction). Let $m$, $r < cm$ and $R$ be positive
integers satisfying $R \ge C r \log(m/r)$. The independent standard Gaussian vectors
$X_k$ in $\mathbb{R}^m$ satisfy the following with probability at least
$1 - e^{-cR}$. Let $f \in \mathbb{R}^m$ be an unknown function of small support,
$|\mathrm{supp} f| \le r$, and suppose we are given $R$ measurements
$\langle f, X_k \rangle$. Then $f$ can be exactly reconstructed from these
measurements as a solution to the Basis Pursuit problem (BP').

This theorem gives uniformity in the Candes-Tao result [8], improves the polynomial
probability to an exponential probability, and improves upon the number $R$ of
measurements (which was $R \ge C r \log m$ in [8]). Donoho [?] proved a weaker form
of Theorem 2.1 with $R/r$ bounded below by some function of $m/r$.

Proof. Write $g = f - u$ for some $u \in \mathbb{R}^m$. Then (BP') reads as

$$ \min \|u - f\|_1 \quad \text{subject to} \quad \langle u, X_k \rangle = 0, \quad k = 1, \ldots, R. \tag{2.1} $$

The constraints here define a random $(n = m - R)$-dimensional subspace $Y$ of
$\mathbb{R}^m$. Now apply Theorem 1.1 with $y = 0$ and $y' = f$. It states that the
unique solution to (2.1) is $u = 0$. Therefore, the unique solution to (BP') is
$f$. ■

$^3$ All the components of $X_k$ are independent standard Gaussian random variables.
2.2. Compressible functions. In a larger class of compressible functions we
can only hope for an approximate reconstruction. This is a class of functions $f$
that are well compressible by a known orthogonal transform, such as Fourier or
wavelet. This means that the coefficients of $f$ with respect to a certain known
orthogonal basis have a power decay. By applying an appropriate rotation, we can
assume that this basis is the canonical basis of $\mathbb{R}^m$, thus $f$ satisfies

$$ f^*(s) \le s^{-1/p}, \quad s = 1, \ldots, m, \tag{2.2} $$

where $f^*$ denotes a nonincreasing rearrangement of $f$. Many natural signals are
compressible for some $0 < p < 1$, such as smooth signals and signals with bounded
variations (see [?]), in particular most photographic images. Theorem 2.1 implies,
by the argument of [8], that functions compressible in some basis can be
approximately reconstructed from few fixed linear measurements:

Corollary 2.2 (Uniform Approximate Reconstruction). Let $m$ and $R$ be positive
integers. The independent standard Gaussian vectors $X_k$ in $\mathbb{R}^m$ satisfy
the following with probability at least $1 - e^{-cR}$. Assume that an unknown
function $f \in \mathbb{R}^m$ satisfies either (2.2) for some $0 < p < 1$ or
$\|f\|_1 \le 1$ for $p = 1$. Suppose that we are given $R$ measurements
$\langle f, X_k \rangle$. Then $f$ can be approximately reconstructed from these
measurements: a unique solution $g$ to the Basis Pursuit problem (BP') satisfies



where $C_p$ depends on $p$ only.

This corollary also gives uniformity in another Candes-Tao result from [8] (see also
[11]): it improves the polynomial probability to an exponential probability, and
also improves upon the approximation error.

3. Counting low-dimensional facets of polytopes.

Theorem 1.1 turns out to be equivalent to a problem of counting lower-dimensional
facets of polytopes. Let $B_1^m$ denote the unit ball with respect to the 1-norm; it
is sometimes called the unit octahedron. The polar body is the unit cube
$B_\infty^m = [-1, 1]^m$. The conclusion of Theorem 1.1 is then equivalent to the
following statement: the affine subspace $z + Y$ is tangent to the unit octahedron
at the point $z$, where $z = y' - y$ (normalized to have unit 1-norm). This should
happen for all $z$ from the coordinate subspaces $\mathbb{R}^I$ with $|I| = r$. By
duality, this means that the subspace $Y^\perp$ intersects all $(m - r)$-dimensional
facets of the unit cube. The section of the cube by the subspace $Y^\perp$ forms an
origin-symmetric polytope of dimension $R$ and with $2m$ facets.

Our problem can thus be stated as a problem of counting lower-dimensional facets
of polytopes.

    Consider an $R$-dimensional origin-symmetric polytope with $2m$ facets.
    How many $(R - r)$-dimensional facets can it have?
Clearly$^4$, no more than $2^r \binom{m}{r}$. Does there exist a polytope with that
many facets? Our ability to construct such a polytope is equivalent to the existence
of the efficient error correcting code. Indeed, looking at the canonical realization
of such a polytope as a section of the unit cube by a subspace $Y^\perp$, we see
that $Y^\perp$ intersects all the $(m - r)$-dimensional facets of the cube. Thus $Y$
satisfies the conclusion of Theorem 1.1. We can thus state Theorem 1.1 in the
following form:

Theorem 3.1. There exists an $R$-dimensional symmetric polytope with $2m$ facets
and with the maximal number of $(R - r)$-dimensional facets (which is
$2^r \binom{m}{r}$), provided $R \ge C r \log(m/r)$. A random section of the cube
forms such a polytope with probability $1 - e^{-cR}$.

So, how can we prove that a random subspace $Y^\perp$ indeed intersects all the
$(m - r)$-dimensional facets of the cube? It is enough to show that $Y^\perp$
intersects one such fixed facet with exponential probability (bigger than
$1 - 2^{-r} \binom{m}{r}^{-1}$). The main difficulty here is that the concentration
of measure technique cannot be readily applied. This is because the $\infty$-norm
defined by the unit cube (more precisely, by its facet) has a bad Lipschitz
constant. To improve the Lipschitzness, we first project the facet onto a random
subspace (within its affine span); the random subspace parallel to which we project
is taken from the random directions that form $Y^\perp$. This creates a big
Euclidean ball inside the projected facet; here we shall use the full strength of
the estimate of Garnaev and Gluskin [20] on Euclidean projections of a cube. The
existence of the Euclidean ball inside a body creates the needed Lipschitzness, so
we can now use the concentration of measure technique.

The rest of the paper is organized as follows. In Section 4 we prove Theorem 1.1.
In Section 5 we discuss some optimality and robustness of the Basis Pursuit with
applications to error correcting codes over finite alphabets.

4. Proof 

We shall use the following standard notations throughout the proof. The $p$-norm
($1 \le p < \infty$) on $\mathbb{R}^m$ is defined by
$\|x\|_p^p = \sum_i |x_i|^p$, and for $p = \infty$ it is
$\|x\|_\infty = \max_i |x_i|$. The unit ball with respect to the $p$-norm on
$\mathbb{R}^m$ is denoted by $B_p^m$. When the $p$-norm is considered on a
coordinate subspace $\mathbb{R}^I$, $I \subset \{1, \ldots, m\}$, the corresponding
unit ball is denoted by $B_p^I$.

The unit Euclidean sphere in a subspace $E$ is denoted by $S(E)$. The normalized
rotationally invariant Lebesgue measure on $S(E)$ is denoted by $\sigma_E$. The
orthogonal projection onto a subspace $E$ is denoted by $P_E$. The standard
Gaussian measure on $E$ (with the identity covariance matrix) is denoted by
$\gamma_E$. When $E = \mathbb{R}^d$, we write $\sigma_{d-1}$ for $\sigma_E$ and
$\gamma_d$ for $\gamma_E$.

$^4$ Any such facet is the intersection of some $r$ facets of the polytope of full
dimension $R - 1$; there are $m$ facets to choose from, each coming with its
opposite by the symmetry.

4.1. Duality. We begin the proof of Theorem 1.1 with a typical duality argument,
leading to the same reformulation of the problem as in [8]. We claim that the
conclusion of Theorem 1.1 follows from (and is actually equivalent to) the following
separation condition:

$$ (z + Y) \cap \mathrm{interior}(B_1^m) = \emptyset \quad \text{for all } z \in \bigcup_{|I| = r} \partial B_1^I, \tag{4.1} $$

where $\partial B_1^I$ denotes the unit sphere of the 1-norm in $\mathbb{R}^I$.
Indeed, suppose (4.1) holds. We apply it for

$$ z = \frac{y - y'}{\|y - y'\|_1}, $$

noting that $z \in \bigcup_{|I| = r} \partial B_1^I$ holds, because $y$ and $y'$
differ in at most $r$ coordinates. By (4.1),

$$ \{z + v\} \cap \mathrm{interior}(B_1^m) = \emptyset \quad \text{for all } v \in Y, $$

which implies

$$ \|z + v\|_1 \ge 1 \quad \text{for all } v \in Y. $$

Let $u \in Y$ be arbitrary. Using the inequality above for
$v := (u - y) / \|y - y'\|_1$, so that $z + v = (u - y') / \|y - y'\|_1$, we
conclude that

$$ \|u - y'\|_1 \ge \|y - y'\|_1 \quad \text{for all } u \in Y. $$

This proves that $y$ is indeed a solution to (BP). The solution to (BP) is unique
with probability 1 in the Grassmannian. This follows from a direct dimension
argument, see e.g. [8].

By the Hahn-Banach theorem, the separation condition (4.1) is equivalent to the
following: for every $z \in \bigcup_{|I| = r} \partial B_1^I$ (the boundary of
$B_1^I$ in $\mathbb{R}^I$) there exists $w = w(z) \in Y^\perp$ such that

$$ \langle w, z \rangle = \sup_{x \in B_1^m} \langle w, x \rangle = \|w\|_\infty. $$

After normalizing $\|w\|_\infty = 1$, this holds if and only if the components of
$w$ satisfy

$$ w_j = \mathrm{sign}(z_j) \ \text{ for } j \in I, \qquad |w_j| \le 1 \ \text{ for } j \in I^c. \tag{4.2} $$

The set of vectors $w$ in $\mathbb{R}^m$ that satisfy (4.2) forms an
$(m - r)$-dimensional facet of the unit cube $B_\infty^m$. Then with
$E := Y^\perp$ we can say that the conclusion of Theorem 1.1 is equivalent to the
following:

    A random $R$-dimensional subspace $E$ in $\mathbb{R}^m$ intersects all the
    $(m - r)$-dimensional facets of the unit cube with probability at least
    $1 - e^{-cR}$.
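Condition (4.2) is easy to test for a concrete pair of vectors. The sketch below is a hypothetical helper, not part of the paper: it takes $I$ to be the support of $z$ and checks only the sign and magnitude pattern of $w$; membership of $w$ in $Y^\perp$ would have to be checked separately.

```python
def satisfies_4_2(w, z, tol=1e-12):
    """Check (4.2) with I = supp(z): w_j = sign(z_j) on I and |w_j| <= 1 off I."""
    for wj, zj in zip(w, z):
        if zj != 0:
            sign = 1.0 if zj > 0 else -1.0
            if abs(wj - sign) > tol:
                return False
        elif abs(wj) > 1.0 + tol:
            return False
    return True


def pairing(w, z):
    """The inner product <w, z>."""
    return sum(a * b for a, b in zip(w, z))


# Illustrative vectors (assumed): z has 1-norm one and support {0, 2}.
z = [0.25, 0.0, -0.75, 0.0, 0.0]
w_good = [1.0, 0.3, -1.0, -0.9, 0.0]   # lies on the facet described by (4.2)
w_bad = [1.0, 0.3, -1.0, 1.4, 0.0]     # leaves the unit cube in coordinate 3

ok_good = satisfies_4_2(w_good, z)
ok_bad = satisfies_4_2(w_bad, z)
value = pairing(w_good, z)             # equals ||z||_1 = ||w_good||_inf = 1
```

When (4.2) holds, the pairing $\langle w, z \rangle$ equals $\|z\|_1$, which is the equality case in the displayed duality relation.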

It will be enough to show that $E$ intersects one fixed facet with probability
$1 - e^{-cR}$. Indeed, since the total number of the facets is
$N = 2^r \binom{m}{r}$, the probability that $E$ misses some facet would be at most
$N e^{-cR} \le e^{-c_1 R}$ with an appropriate choice of the absolute constant in
(1.2).
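The union bound works because $\log N$ is of order $r \log(m/r)$, the same order as the redundancy required in (1.2). A small numerical sketch, assuming the standard estimate $\binom{m}{r} \le (em/r)^r$ and a placeholder value for the absolute constant $c$ (the paper does not specify it):

```python
import math


def log_num_facets(m, r):
    """log N for N = 2^r * binom(m, r), the number of (m-r)-dimensional facets."""
    return r * math.log(2.0) + math.log(math.comb(m, r))


m, r = 2000, 20
log_N = log_num_facets(m, r)

# Standard estimate binom(m, r) <= (e*m/r)^r gives log N <= r * log(2*e*m/r).
entropy_bound = r * math.log(2.0 * math.e * m / r)

c = 0.5                            # placeholder for the absolute constant c
R_needed = 2.0 * log_N / c         # then N * exp(-c*R) <= exp(-c*R/2)
log_failure = log_N - c * R_needed
```

With `R_needed` of order $r \log(m/r)$, the failure probability $N e^{-cR}$ is again exponentially small in $R$, with the constant halved.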
4.2. Realizing a random subspace. We are to show that a random $R$-dimensional
subspace $E$ intersects one fixed $(m - r)$-dimensional facet of the unit cube
$B_\infty^m$ with high probability. Without loss of generality, we can assume that
our facet is

$$ F = \{(w_1, \ldots, w_{m-r}, 1, \ldots, 1) : \text{all } |w_j| \le 1\}, $$

whose center is

$$ \theta = (\underbrace{0, \ldots, 0}_{m-r}, 1, \ldots, 1). $$

The probability we are interested in is

$$ P := \mathrm{Prob}\{E \cap F \ne \emptyset\}. $$

We shall restrict our attention to the linear span of $F$,

$$ \mathrm{lin}(F) = \{(w_1, \ldots, w_{m-r}, t, \ldots, t) : \text{all } w_j \in \mathbb{R},\ t \in \mathbb{R}\}, $$

and even to the affine span of $F$,

$$ \mathrm{aff}(F) = \{(w_1, \ldots, w_{m-r}, 1, \ldots, 1) : \text{all } w_j \in \mathbb{R}\}. $$

Only the random affine subspace $E \cap \mathrm{aff}(F)$ matters for us, because

$$ P = \mathrm{Prob}\{(E \cap \mathrm{aff}(F)) \cap F \ne \emptyset\}. $$

The dimension of that affine subspace is almost surely

$$ l := \dim(E \cap \mathrm{aff}(F)) = R - r. $$

We can realize the random affine subspace $E \cap \mathrm{aff}(F)$ (or rather a
random subspace with the same law) by the following algorithm:

(1) Select a random variable $D$ with the same law as
    $\mathrm{dist}(\theta, E \cap \mathrm{aff}(F))$.

(2) Select a random subspace $L_0$ in the Grassmannian $G_{m-r,l}$. It will realize
    the "direction" of $E \cap \mathrm{aff}(F)$ in $\mathrm{aff}(F)$.

(3) Select a random point $z$ on the Euclidean sphere $D \cdot S(L_0^\perp)$ of
    radius $D$, according to the uniform distribution on the sphere. Here
    $L_0^\perp$ is the orthogonal complement of $L_0$ in $\mathbb{R}^{m-r}$. The
    vector $z$ will realize the distance from the affine subspace
    $E \cap \mathrm{aff}(F)$ to the center $\theta$ of $F$.

(4) Set $L = \theta + z + L_0$. Thus the random affine subspace $L$ has the same law
    as $E \cap \mathrm{aff}(F)$.
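Steps (2) to (4) above can be sketched directly. This is an illustration under stated assumptions, not the authors' implementation: the law of $D$ in step (1) is not given in closed form here, so `D` is passed in by the caller; the random subspace is sampled by Gram-Schmidt on Gaussian vectors; and all function names are hypothetical. The basis of $L_0$ is returned as vectors in $\mathbb{R}^{m-r}$, to be padded with $r$ zeros when viewed in $\mathbb{R}^m$.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_components(v, basis):
    """Subtract from v its projection onto span(basis); basis is orthonormal."""
    for _ in range(2):                    # second pass improves numerical accuracy
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
    return v


def random_subspace(dim, k, rng):
    """Orthonormal basis of a Haar-random k-dimensional subspace of R^dim."""
    basis = []
    while len(basis) < k:
        v = remove_components([rng.gauss(0.0, 1.0) for _ in range(dim)], basis)
        norm = dot(v, v) ** 0.5
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis


def realize_affine_subspace(m, r, l, D, rng):
    """Steps (2)-(4); the distance D from step (1) is supplied by the caller."""
    dim = m - r                            # aff(F) is identified with R^(m-r)
    L0 = random_subspace(dim, l, rng)      # step (2)
    g = remove_components([rng.gauss(0.0, 1.0) for _ in range(dim)], L0)
    norm = dot(g, g) ** 0.5
    z = [D * x / norm for x in g]          # step (3): uniform on D * S(L0-perp)
    theta = [0.0] * dim + [1.0] * r
    base_point = [t + s for t, s in zip(theta, z + [0.0] * r)]   # theta + z
    return base_point, L0                  # step (4): L = base_point + L0


rng = random.Random(1)
base_point, L0 = realize_affine_subspace(m=10, r=3, l=2, D=0.7, rng=rng)
```

Normalizing a Gaussian vector projected onto $L_0^\perp$ gives the uniform distribution on the unit sphere of $L_0^\perp$, which is why step (3) needs no explicit basis of the complement.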
Hence

$$ P = \mathrm{Prob}\{L \cap F \ne \emptyset\} = \mathrm{Prob}\{(z + L_0) \cap B_\infty^{m-r} \ne \emptyset\} = \mathrm{Prob}\{z \in P_{L_0^\perp} B_\infty^{m-r}\}. $$

$H := L_0^\perp$ is a random subspace in $G_{m-r,\,m-r-l} = G_{m-r,\,m-R}$. By the
rotational invariance of $z \in D \cdot S(H)$,

$$ P = \int\!\!\int \sigma_H\big(D^{-1} P_H B_\infty^{m-r}\big) \, d\nu(H) \, d\mu(D) \tag{4.3} $$

where $\nu$ is the normalized Haar measure on $G_{m-r,\,m-R}$ and $\mu$ is the law
of $D$. We shall bound $P$ in two steps:

(1) Prove that the distance $D$ is small with high probability;

(2) Prove that a suitable multiple of the random projection $P_H B_\infty^{m-r}$
    has an almost full Gaussian (thus also spherical) measure.

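4.3. The distance D from the center of the facet to a random subspace. We
shall first relate $D$, the distance to the affine subspace
$E \cap \mathrm{aff}(F)$, to the distance to the linear subspace
$E \cap \mathrm{lin}(F)$. Equivalently, we compute the length of the projection
onto $E \cap \mathrm{lin}(F)$.

Lemma 4.1.

$$ \|P_{E \cap \mathrm{lin}(F)} \theta\|_2 = \frac{r}{\sqrt{r + D^2}}. $$

Proof. Let $f$ be the multiple of the vector $P_{E \cap \mathrm{lin}(F)} \theta$
such that $f - \theta$ is orthogonal to $\theta$. Such a multiple exists and is
unique, as this is a two-dimensional problem.

[Figure: the points $\theta$, $f$ and $P_{E \cap \mathrm{lin}(F)} \theta$.]

Then $f \in E \cap \mathrm{aff}(F)$. Notice that $D = \|f - \theta\|_2$. By the
similarity of the triangles with the vertices
$(0, \theta, P_{E \cap \mathrm{lin}(F)} \theta)$ and $(0, f, \theta)$, we conclude
that

$$ \|P_{E \cap \mathrm{lin}(F)} \theta\|_2 = \frac{\|\theta\|_2^2}{\|f\|_2} = \frac{r}{\sqrt{r + D^2}} $$

because $\|\theta\|_2 = \sqrt{r}$ and $\|f\|_2^2 = \|\theta\|_2^2 + D^2$. This
completes the proof. ■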

The length of the projection of a fixed vector onto a random subspace in Lemma 4.1
is well known. The asymptotically sharp estimate was computed by S. Artstein [?],
but we will be satisfied with a much weaker elementary estimate, see e.g. [30],
15.2.2.

Lemma 4.2. Let $\theta \in \mathbb{R}^d$ and let $G$ be a random subspace in
$G_{d,k}$. Then

$$ \mathrm{Prob}\Big\{ c \sqrt{\tfrac{k}{d}} \, \|\theta\|_2 \le \|P_G \theta\|_2 \le C \sqrt{\tfrac{k}{d}} \, \|\theta\|_2 \Big\} \ge 1 - 2e^{-ck}. $$
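The scaling $\sqrt{k/d}$ in Lemma 4.2 can be observed in a quick simulation. The sketch below relies on rotation invariance: projecting a fixed unit vector onto a Haar-random $k$-dimensional subspace has the same law as taking the first $k$ coordinates of a uniformly random unit vector. The constants `0.5` and `2.0` are placeholder stand-ins for $c$ and $C$, and the dimensions are arbitrary.

```python
import math
import random


def projection_ratio(d, k, rng):
    """Sample ||P_G theta||_2 / ||theta||_2 for a Haar-random k-dim subspace G.

    By rotation invariance this equals in law the Euclidean norm of the first
    k coordinates of a uniformly random unit vector in R^d.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return math.sqrt(sum(x * x for x in g[:k]) / sum(x * x for x in g))


rng = random.Random(0)
d, k = 400, 40
ratios = [projection_ratio(d, k, rng) for _ in range(2000)]
target = math.sqrt(k / d)

# Fraction of samples inside [0.5 * target, 2 * target] (placeholder constants).
inside = sum(0.5 * target <= t <= 2.0 * target for t in ratios) / len(ratios)
mean_square = sum(t * t for t in ratios) / len(ratios)   # close to k / d
```

The squared ratio has mean exactly $k/d$, and its fluctuations shrink as $k$ grows, in line with the $1 - 2e^{-ck}$ probability bound.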
We apply this lemma for $G = E \cap \mathrm{lin}(F)$, which is a random subspace in
the Grassmannian of $(l + 1)$-dimensional subspaces of $\mathrm{lin}(F)$. Since
$\dim \mathrm{lin}(F) = m - r + 1$, we have

$$ \mathrm{Prob}\Big\{ \|P_{E \cap \mathrm{lin}(F)} \theta\|_2 \ge c \sqrt{\tfrac{l + 1}{m - r + 1}} \, \|\theta\|_2 \Big\} \ge 1 - 2e^{-cl}. $$

Together with Lemma 4.1 this gives

$$ \mathrm{Prob}\Big\{ D \le c \sqrt{m - r} \, \sqrt{\tfrac{r}{l}} \Big\} \ge 1 - 2e^{-cl}. \tag{4.4} $$

Note that $\sqrt{m - r}$ is the radius of the Euclidean ball circumscribed on the
facet $F$. The statement $D \le \sqrt{m - r}$ would only tell us that the random
subspace $E$ intersects the circumscribed ball, not yet the facet itself. The ratio
$r/l$ in (4.4) will be chosen logarithmically small, which will force $E$ to
intersect also the facet $F$.

4.4. Gaussian measure of random projections of the cube. By (4.3) and (4.4),

$$ P \ge \int \sigma_H\Big( \frac{1}{c\sqrt{m - r}} \sqrt{\frac{l}{r}} \, P_H B_\infty^{m-r} \Big) \, d\nu(H) - 2e^{-cl}. $$

We can replace the spherical measure $\sigma_H$ by the Gaussian measure $\gamma_H$
via a simple lemma:

Lemma 4.3. Let $K$ be a star-shaped set in $\mathbb{R}^d$. Then

$$ \gamma_d(c\sqrt{d} \cdot K) - e^{-d} \le \sigma_{d-1}(K) \le \gamma_d(C\sqrt{d} \cdot K) \cdot (1 + e^{-d}). $$

Proof. Passing to polar coordinates, by the rotational invariance of the Gaussian
measure we see that there exists a probability measure $\mu$ on $\mathbb{R}_+$ so
that the Gaussian measure of every set $A$ can be computed as
$\gamma_d(A) = \int_{\mathbb{R}_+} \sigma^t(A) \, d\mu(t)$, where $\sigma^t$ denotes
the normalized Lebesgue measure on the Euclidean sphere of radius $t$ in
$\mathbb{R}^d$. Since $K$ is star-shaped, $\sigma^t(K)$ is a non-increasing function
of $t$. Hence

$$ \gamma_d(K) \ge \int_0^{C\sqrt{d}} \sigma^t(K) \, d\mu(t) \ge \sigma^{C\sqrt{d}}(K) \cdot \gamma_d(C\sqrt{d} \, B_2^d) $$

and

$$ \gamma_d(K) \le \int_0^{c\sqrt{d}} d\mu(t) + \sigma^{c\sqrt{d}}(K) \int_{c\sqrt{d}}^\infty d\mu(t) \le \gamma_d(c\sqrt{d} \cdot B_2^d) + \sigma^{c\sqrt{d}}(K). $$

The classical large deviation inequalities imply
$\gamma_d(c\sqrt{d} \cdot B_2^d) \le e^{-d}$ and
$\gamma_d(C\sqrt{d} \, B_2^d) \ge 1 - e^{-d}/2$. Using the above argument for
$c\sqrt{d} \cdot K$ and for $C\sqrt{d} \cdot K$, we conclude that
$\gamma_d(c\sqrt{d} \cdot K) \le e^{-d} + \sigma_{d-1}(K)$ and
$\gamma_d(C\sqrt{d} \cdot K) \ge \sigma_{d-1}(K) \cdot (1 - e^{-d}/2)$. ■

Using Lemma 4.3 in the space $H$ of dimension $d = m - R$, we obtain

$$ P \ge \int \gamma_H\Big( c \sqrt{\frac{m - R}{m - r}} \sqrt{\frac{l}{r}} \, P_H B_\infty^{m-r} \Big) \, d\nu(H) - 2e^{-cl} - e^{-(m - R)}. $$

By choosing the absolute constant $c$ in the assumption $r < cm$ appropriately
small, we can assume that $2r \le R \le m/2$. Thus

$$ P \ge \int \gamma_H\Big( c \sqrt{\frac{l}{r}} \, P_H B_\infty^{m-r} \Big) \, d\nu(H) - 2e^{-cR}. \tag{4.5} $$

We now compute the Gaussian measure of random projections of the cube.

Proposition 4.4. Let $H$ be a random subspace in $G_{n,n-k}$, $k < n/2$. Then the
inequality

$$ \gamma_H\Big( c \sqrt{\log \frac{n}{k}} \, P_H B_\infty^n \Big) \ge 1 - e^{-ck} $$

holds with probability at least $1 - e^{-ck}$ in the Grassmannian.

The proof of this estimate will follow from the concentration of Gaussian measure,
combined with the existence of a big Euclidean ball inside a random projection of
the cube.

Lemma 4.5 (Concentration of Gaussian measure). Let $A$ be a measurable set in
$\mathbb{R}^n$. Then for $\varepsilon > 0$,

$$ \gamma_n(A) \ge e^{-\varepsilon^2 n} \quad \text{implies} \quad \gamma_n(A + C\varepsilon\sqrt{n} \, B_2^n) \ge 1 - e^{-\varepsilon^2 n}. $$

With the stronger assumption $\gamma(A) \ge 1/2$, this lemma is the classical
concentration inequality, see [28], 1.1. The fact that the concentration holds also
for exponentially small sets follows formally by a simple extension argument that
was first noticed by D. Amir and V. Milman in [2], see [28], Lemma 1.1.

The optimal result on random projections of the cube is due to Garnaev and
Gluskin [20].

Theorem 4.6 (Euclidean projections of the cube [20]). Let $H$ be a random subspace
in $G_{n,n-k}$, where $k = \alpha n \le n/2$. Then with probability at least
$1 - e^{-ck}$ in the Grassmannian, we have

$$ c(\alpha) \, P_H(\sqrt{n} B_2^n) \subset P_H(B_\infty^n) \subset P_H(\sqrt{n} B_2^n) $$

where

$$ c(\alpha) = c \sqrt{\frac{\alpha}{\log(1/\alpha)}}. $$

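Proof of Proposition 4.4. Let $g_1, g_2, \ldots$ be independent standard Gaussian
random variables. Then for a suitable positive absolute constant $c$ and for every
$0 < \varepsilon < 1/2$,

$$ \gamma_n\Big( c \sqrt{\log \tfrac{1}{\varepsilon}} \, B_\infty^n \Big) = \mathrm{Prob}\Big\{ \max_{1 \le i \le n} |g_i| \le c \sqrt{\log \tfrac{1}{\varepsilon}} \Big\} \ge (1 - \varepsilon^2/10)^n \ge e^{-\varepsilon^2 n}. $$

Since for every measurable set $A$ and every subspace $H$ one has
$\gamma_H(P_H A) \ge \gamma(A)$, we conclude that

$$ \gamma_H\Big( c \sqrt{\log \tfrac{1}{\varepsilon}} \, P_H B_\infty^n \Big) \ge e^{-\varepsilon^2 n} \quad \text{for } 0 < \varepsilon < 1/2. $$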
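The Gaussian measure of a scaled cube is a product of one-dimensional probabilities, so the lower bound $e^{-\varepsilon^2 n}$ can be checked numerically. The sketch below assumes the value `c = 2.0` for the absolute constant, which is a placeholder and not taken from the paper, and works with logarithms to avoid underflow.

```python
import math


def log_gaussian_measure_of_cube(n, t):
    """log of gamma_n(t * B_inf^n) = n * log P(|g| <= t), g standard Gaussian."""
    tail = math.erfc(t / math.sqrt(2.0))       # P(|g| > t)
    return n * math.log1p(-tail)


c = 2.0          # placeholder value of the absolute constant
n = 1000
checks = []
for eps in (0.4, 0.2, 0.1, 0.01):
    t = c * math.sqrt(math.log(1.0 / eps))
    # Compare log gamma_n(t * B_inf^n) with log(e^{-eps^2 n}) = -eps^2 * n.
    checks.append(log_gaussian_measure_of_cube(n, t) >= -eps * eps * n)
```

All entries of `checks` come out true for this choice of `c`, reflecting that the tail $P(|g| > c\sqrt{\log(1/\varepsilon)})$ is a small multiple of $\varepsilon^2$.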
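Then by Lemma 4.5,

$$ \gamma_H\Big( C \sqrt{\log \tfrac{1}{\varepsilon}} \, P_H B_\infty^n + C\varepsilon\sqrt{n} \, P_H B_2^n \Big) \ge 1 - e^{-\varepsilon^2 n} \quad \text{for } 0 < \varepsilon < 1/2. \tag{4.6} $$

Theorem 4.6 tells us that for a random subspace $H$, if
$\varepsilon = c\sqrt{\alpha} = c\sqrt{k/n}$, then the Euclidean ball is absorbed by
the projection of the cube in (4.6):

$$ \varepsilon\sqrt{n} \, P_H B_2^n \subset C \sqrt{\log \tfrac{1}{\varepsilon}} \, P_H B_\infty^n. $$

Hence for a random subspace $H$ and for $\varepsilon$ as above we have

$$ \gamma_H\Big( C \sqrt{\log \tfrac{1}{\varepsilon}} \, P_H B_\infty^n \Big) \ge 1 - e^{-\varepsilon^2 n}, $$

which completes the proof. ■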

Coming back to (4.5), we shall use Proposition 4.4 for a random subspace $H$ in the
Grassmannian $G_{m-r,\,m-R}$. We conclude that if

$$ \sqrt{\frac{l}{r}} \ge c \sqrt{\log \frac{m - r}{R - r}}, \tag{4.7} $$

then with probability at least $1 - e^{-cR}$ in the Grassmannian,

$$ \gamma_H\Big( c \sqrt{\frac{l}{r}} \, P_H B_\infty^{m-r} \Big) \ge 1 - e^{-cR}. $$

Since $\frac{m - r}{R - r} \le \frac{m}{r}$, the choice of $R$ in (1.2) satisfies
condition (4.7). Thus (4.5) implies

$$ P \ge 1 - 3e^{-cR}. $$

This completes the proof. ■

5. Optimality, robustness, finite alphabets

5.1. Optimality. The logarithmic term in Theorems 1.1 and 2.1 is necessary, at
least in the case of small $r$. Indeed, combining formula (4.3) and Lemmas 4.1, 4.2,
4.3 we obtain

$$ P \le \int \gamma_H\Big( c \sqrt{\frac{l}{r}} \, P_H B_\infty^{m-r} \Big) \, d\nu(H) + 2e^{-cR}. \tag{5.1} $$

To estimate the Gaussian measure we need the following

Lemma 5.1. Let $x_1, \ldots, x_s$ be vectors in $\mathbb{R}^s$. Then

$$ \gamma_s\Big( \sum_{j=1}^s [-x_j, x_j] \Big) \le \gamma_s(M \cdot B_\infty^s) $$

where $M = \max_{j=1,\ldots,s} \|x_j\|_2$.
The sum in the Lemma is understood as the Minkowski sum of sets of vectors,
$A + B = \{a + b \mid a \in A,\ b \in B\}$.

Proof. Let $F = \mathrm{span}(x_1, \ldots, x_{s-1})$ and let $V = F^\perp$. Let
$v \in V$ be a unit vector. Set $Z = \sum_{j=1}^{s-1} [-x_j, x_j]$. Then

$$ \gamma_s\Big( \sum_{j=1}^s [-x_j, x_j] \Big) = \int_V \gamma_F\Big( \Big( \sum_{j=1}^s [-x_j, x_j] - tv \Big) \cap F \Big) \, d\gamma_V(t) = \int_{[-P_V x_s, P_V x_s]} \gamma_F(Z + t P_F x_s) \, d\gamma_V(t). $$

By Anderson's Lemma (see [?]), $\gamma_F(Z + t P_F x_s) \le \gamma_F(Z)$. Thus,

$$ \gamma_s\Big( \sum_{j=1}^s [-x_j, x_j] \Big) \le \gamma_V([-P_V x_s, P_V x_s]) \cdot \gamma_F(Z) \le \gamma_1([-M, M]) \cdot \gamma_F(Z). $$

The proof of the Lemma is completed by induction. ■

The Gaussian measure of a projection of the cube can be estimated as follows.

Proposition 5.2. Let $H$ be any subspace in $G_{n,n-k}$, $k < n/2$. Then



ATkf< PuB ~)- e ~ m ' K (5 ' 2) 

Proof. Decompose $I$ into the disjoint union of the sets $J_1, \ldots, J_{s+1}$, so
that each of the sets $J_1, \ldots, J_s$ contains $k + 1$ elements and
$(k + 1)s \le n < (k + 1)(s + 1)$. Let $1 \le j \le s$. Let
$U_j = H \cap \mathrm{span}(e_i,\ i \in \{1, \ldots, n\} \setminus J_j)^\perp$, where
$e_1, \ldots, e_n$ is the standard basis of $\mathbb{R}^n$. Then $U_j$ is a
one-dimensional subspace of $H$. Set

Xj — ^ ^ ^iPH^ij 

where the signs £j G {—1, 1} are chosen to maximize H-F^-a^/l^- Let i£ = span(xi, . . . x s 
Since PujB^o = [— x ji x j]i we S et 

p H B5 r\E = Y t [-x j ,x j ], 

3=1 

where the sum is understood in the sense of Minkowski addition. Since ||-P[7j|| = 1, 
\\ x j\\2 < C\Pk and by Lemma 15. II 



IE 



cV^l^l-x^} ) < lE (c'v^.B*) < e 



for some appropriately chosen constant $c$. Finally, log-concavity of the Gaussian
measure implies that for any convex symmetric body $K \subset H$

$$ \gamma_H(K) \le \gamma_E(K \cap E). $$

Combining (5.1) and (5.2) we obtain $P \le 2e^{-cR}$, whenever $R \le c \log(m/r)$.

5.2. Robustness and codes for finite alphabets. Robustness is a well known
property of the Basis Pursuit method. It states that the solution to (BP) is stable
with respect to the 1-norm. Indeed, it is not hard to show that, once Theorem 1.1
holds, the unknown vector $y$ in Theorem 1.1 can be approximately recovered from
$y'' = y' + h$, where $h \in \mathbb{R}^m$ is any additional error vector of small
1-norm (see [?]). Namely, the solution $u$ to the Basis Pursuit problem

$$ \min_{u \in Y} \|u - y''\|_1 $$

satisfies

$$ \|u - y\|_1 \le 4\|h\|_1. $$

This implies a possibility of quantization of the coefficients in the process of
encoding and yields error correcting codes over alphabets of size polynomial in $n$.

The following is the $(n, m, r)$-error correcting code under assumption (1.2), with
input words $x$ over the alphabet $\{1, \ldots, p\}$ and the encoded words
$\hat{y}$ over the alphabet $\{1, \ldots, Cpn^{3/2}\}$. The construction is the same
as in (1.1); we just introduce quantization. The encoder takes
$x \in \{1, \ldots, p\}^n$, computes $y = Qx$ and outputs the $\hat{y}$ whose
coefficients are the quantized coefficients of $y$ with step $\frac{1}{10m}$. Then
$\hat{y} \in \frac{1}{10m}\mathbb{Z}^m \cap [-p\sqrt{m}, p\sqrt{m}]^m$, which by
rescaling can be identified with $\{1, \ldots, Cpn^{3/2}\}$ because we can assume
that $m \le 2n$. The decoder takes $y' \in \frac{1}{10m}\mathbb{Z}^m$, finds the
solution $u$ to (BP) with $Y = \mathrm{range}(Q)$, inverts to $x' = Q^T u$ and
outputs $\hat{x}'$ whose coefficients are the quantized coefficients of $x'$ with
step 1.

This is indeed an $(n, m, r)$-error correcting code. If $y'$ differs from $\hat{y}$
on at most $r$ coordinates, this and the condition $\|\hat{y} - y\|_1 \le 0.1$
implies by the robustness that $\|u - y\|_1 \le 0.4$. Hence
$\|x' - x\|_2 = \|Q^T(u - y)\|_2 = \|u - y\|_2 \le \|u - y\|_1 \le 0.4$. Thus
$\hat{x}' = x$, so the decoder recovers $x$ from $y'$ correctly.
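The last step of the decoder is plain rounding: since every coordinate of $x' - x$ is at most $\|x' - x\|_2 \le 0.4 < 1/2$ in absolute value, quantizing $x'$ with step 1 returns $x$. A minimal sketch with made-up numbers (the helper `quantize` and the perturbation are illustrative assumptions):

```python
def quantize(v, step):
    """Round every coordinate of v to the nearest multiple of step."""
    return [step * round(a / step) for a in v]


x = [3, 1, 4, 1, 5]                               # original integer word
perturbation = [0.2, -0.2, 0.1, -0.2, 0.1]        # Euclidean norm about 0.374
x_prime = [a + e for a, e in zip(x, perturbation)]

error_norm = sum(e * e for e in perturbation) ** 0.5
x_hat = quantize(x_prime, 1.0)                    # rounds back to x
```

The same helper with a finer step models the encoder side, where the coefficients of $y$ are quantized before transmission.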

The robustness also implies a "continuity" of our error correcting codes. If the
number of corrupted coordinates in the received message $y'$ is bigger than $r$ but
is still a small fraction, then the $(n, m, r)$-error correcting code above can
still recover $y$ up to some small fraction of the coordinates.

We hope to return to consequences of our method, in particular to robustness and
continuity of our codes and generally to codes over finite alphabets, in a separate
publication.